
The American Journal of Human Genetics

Elsevier BV

Preprints posted in the last 7 days, ranked by how well they match The American Journal of Human Genetics's content profile, based on 206 papers previously published here. The average preprint has a 0.20% match score for this journal, so anything above that is already an above-average fit.

1
Heterozygous MMACHC burden variants are associated with higher circulating vitamin B12 in the All of Us Research Program

Cai, L.; DeBerardinis, R. J.

2026-06-04 genetic and genomic medicine 10.64898/2026.06.03.26354855 medRxiv
Top 0.1%
19.6%

Heterozygous carriers of autosomal recessive disease variants are conventionally considered unaffected, yet population-scale genomic datasets reveal subclinical carrier phenotypes. MMACHC encodes a cobalamin-processing protein whose biallelic loss causes cobalamin C deficiency, an inborn error of intracellular cobalamin metabolism. We performed an unbiased quantitative phenome-wide association screen in All of Us Research Program v8 to identify phenotypes associated with rare heterozygous MMACHC burden variants. Serum/plasma vitamin B12 was the top quantitative association. Carriers had higher circulating B12 than non-carriers in adjusted analyses, but also higher homocysteine, suggesting that elevated circulating B12 does not reflect improved intracellular cobalamin function. Carriers were less likely to fall below conventional B12 insufficiency thresholds, indicating a potential diagnostic blind spot. A pathway-wide rare-variant gene-burden analysis (All-by-All) placed this finding in broader biological context. Burdens in genes related to circulating B12 binding or intestinal absorption were associated with lower circulating B12. In contrast, burdens in several genes involved in cellular delivery and intracellular cobalamin handling were associated with higher circulating B12. This step-specific directionality supports a model in which elevated circulating B12 can reflect impaired cellular handling and consequent systemic accumulation rather than improved cellular cobalamin availability. Because EHR-derived B12 is shaped by heterogeneous clinical and medication contexts, prospective carrier-enriched studies with standardized methylmalonic acid, homocysteine, diet, supplement, medication, comorbidity, and symptom ascertainment are needed to evaluate functional-marker-based screening.

2
Documented clinical genetic testing among carriers of hereditary breast and ovarian cancer variants: Ancestry and socioeconomic disparities in the All of Us research program

Yerukala Sathipati, S.; Scott, H.

2026-06-10 oncology 10.64898/2026.06.09.26355262 medRxiv
Top 0.2%
19.1%

Importance: Hereditary breast and ovarian cancer (HBOC) variant carriers benefit from risk-reducing interventions, but only if identified. The extent to which carriers are clinically recognized, and whether recognition is equitable across diverse populations, is poorly characterized in a single large U.S. cohort. Objective: To estimate P/LP HBOC carrier prevalence across genetic ancestry groups, quantify documented clinical genetic testing among carriers, and evaluate ancestry and socioeconomic disparities in testing. Design, Setting, and Participants: Cross-sectional analysis of the All of Us Research Program Controlled Tier (Curated Data Repository v8/C2024Q3R9), comprising participants with short-read whole genome sequencing and linked electronic health record (EHR) and survey data. Carriers were ascertained from research genomic data independent of clinical testing. Exposures: Genetically inferred ancestry (African [AFR], Admixed American [AMR], East Asian [EAS], European [EUR], Middle Eastern [MID], South Asian [SAS]); self-reported household income and educational attainment. Main Outcomes and Measures: (1) Carrier prevalence with Wilson 95% CIs; (2) documented clinical genetic testing (procedure codes) among carriers; (3) adjusted odds of documented testing among women, by ancestry, before and after socioeconomic adjustment, using multivariable logistic regression. Results: Among 414,830 participants, P/LP HBOC carrier prevalence was 1.42% (95% CI, 1.38-1.45) overall and similar across ancestry groups (AFR 1.24%, AMR 1.32%, EAS 1.19%, EUR 1.52%, MID 1.68%, SAS 1.33%; overlapping CIs). Among 250,071 women in the testing analysis, documented clinical genetic testing was rare: only 74 of 5,878 carriers overall (1.3%) and 59 of 3,572 European-ancestry carriers (1.7%) had a documented test, with counts below reportable thresholds in all other ancestry groups. 
African-ancestry women had lower adjusted odds of documented testing than European-ancestry women (Model 1 adjusted odds ratio [aOR], 0.32; 95% CI, 0.27-0.39), an association that attenuated but persisted after adjustment for income and education (Model 2 aOR, 0.48; 95% CI, 0.40-0.58; P < 0.001); Admixed American women also had reduced adjusted odds (aOR, 0.71; 95% CI, 0.61-0.84). Lower income and lower education were independently and dose-dependently associated with lower testing odds (income <$25,000 aOR, 0.46; high-school education aOR, 0.54). Conclusions and Relevance: High-risk HBOC variant carriers are present across all ancestry groups at similar frequencies, yet documented clinical genetic testing differed across ancestry groups. African-ancestry women experience a testing gap that is not fully explained by socioeconomic position, implicating structural barriers in access and referral. Population-level strategies that decouple carrier identification from current referral pathways may be required to close this gap.

3
Prioritizing embryos with lower homozygosity may reduce disease risk in children of related individuals undergoing preimplantation genetic testing

Wolfram, T.; Ahangari, M.; Davidson, I.; Wartschinski, L.; Li, J. H.; Eyre, M.; Stern, D.; Schleede, J.; Haghighi, A.; Carmi, S.; Christensen, M.

2026-06-04 genetic and genomic medicine 10.64898/2026.05.30.26354526 medRxiv
Top 0.2%
19.1%

Consanguinity is a reproductive union between individuals who share a recent common ancestor. These unions are common in many regions of the world and increase the burden of rare recessive disorders by elevating autozygosity in offspring. Current reproductive genetic screening focuses on a limited set of known pathogenic variants, leaving most recessive risk unaddressed. Here we argue that embryo-level autozygosity, quantified as the fraction of the genome in long runs of homozygosity (FROH), is a potentially actionable genomic biomarker that can be integrated into routine preimplantation genetic testing as a homozygosity-informed embryo-prioritization framework (PGT-H) that can be layered onto existing embryo biopsy workflows when couples are already undergoing IVF with PGT-A or PGT-M. Using forward simulations of first-cousin and double-first-cousin couples, we show that siblings conceived by the same couple span a wide range of FROH; selecting the lowest-FROH candidate from a cohort of five embryos reduces FROH by approximately 40% on average. Combining these reductions with empirical effect-size estimates, we estimate that for first-cousin couples this strategy could reduce risk of intellectual disability by roughly 35-45% (corresponding to an absolute risk reduction of about 1.8-2.2%) and potentially reduce excess recessive disease burden, while also modestly reducing risk of common diseases such as type 2 diabetes. We outline how existing PGT-A and PGT-M workflows could potentially be extended to report embryo-level FROH and discuss ethical and counseling considerations. Autozygosity-based embryo prioritization offers a principled way to address a component of recessive risk that current variant-centric approaches miss.
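As a rough illustration of the quantity described in the abstract above, the sketch below computes FROH as the summed length of long runs of homozygosity divided by autosomal genome length, then picks the lowest-FROH candidate from a cohort of five. The segment lengths, the 1 Mb minimum length, the autosomal length, and the names `froh`, `embryos`, and `AUTOSOME_BP` are all hypothetical assumptions for illustration; the abstract does not specify the authors' thresholds or pipeline.

```python
def froh(roh_segments_bp, autosome_length_bp, min_length_bp=1_000_000):
    """Fraction of the autosomal genome covered by ROH segments
    that are at least min_length_bp long (threshold is an assumption)."""
    long_total = sum(seg for seg in roh_segments_bp if seg >= min_length_bp)
    return long_total / autosome_length_bp


# Approximate autosomal length in base pairs; an assumption for illustration.
AUTOSOME_BP = 2_875_000_000

# Hypothetical ROH segment lengths (bp) for five sibling embryos.
embryos = {
    "E1": [12_000_000, 30_000_000, 400_000],
    "E2": [55_000_000, 20_000_000, 8_000_000],
    "E3": [5_000_000, 900_000],
    "E4": [40_000_000, 25_000_000],
    "E5": [18_000_000, 22_000_000, 15_000_000],
}

scores = {name: froh(segments, AUTOSOME_BP) for name, segments in embryos.items()}

# Prioritize the embryo with the lowest FROH.
best = min(scores, key=scores.get)

for name, value in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: FROH = {value:.4f}")
print("lowest-FROH candidate:", best)
```

Segments shorter than the minimum length are ignored, so the 400 kb and 900 kb entries do not contribute to any score.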

4
Investigating the Y chromosome in complex disease: Phenome-wide scan across 104,334 Finnish men

Preussner, A.; Leinonen, J. T.; FinnGen, ; Pirinen, M.; Tukiainen, T.

2026-06-10 genetic and genomic medicine 10.64898/2026.06.09.26355235 medRxiv
Top 0.3%
14.5%

Although the Y chromosome represents roughly 2% of the male genome, it is often ignored in genome-wide association studies (GWAS). Consequently, the potential health impacts of Y-chromosomal genetic variation remain incompletely understood. To fill this gap, we performed a phenome-wide association study (PheWAS) in FinnGen across 1,426 binary and quantitative traits using Y-chromosomal variation (frequency ≥ 1%) in 104,334 genotyped men. As Y chromosome variation is prone to population stratification, we performed carefully adjusted association analyses and further examined these through kin-based validation in 19,275 female and 24,712 male first-degree relatives. We found 121 suggestive (p < 5.6×10⁻³) phenotypic associations on the Y chromosome, yet none of these were strong enough to reach phenome-wide significance (p < 3.9×10⁻⁶). While only 38 associations were supported in the kin-based validation, intriguingly we found support for a previously suggested link between haplogroup I1 and coronary heart disease (CHD; OR=1.06, 95% CI=1.02-1.11, p=3.7×10⁻³; male validation OR=1.05; female validation OR=0.97). The I1-CHD association was detected across distinct geographical areas within Finland and was independent of loss of Y (LOY) and of autosomal risk for CHD, suggesting a link between germline Y-chromosomal variation and heart disease risk. Overall, this study presents a comprehensive phenome-wide analysis of Y-chromosomal associations, highlighting the potential relevance of Y-chromosomal variation beyond sex determination. Our findings further emphasize the need for improved capture of Y-chromosomal variants and further analyses in biobank-scale data to allow for deeper exploration of male-specific genetic architecture of complex diseases.

5
STELLAR: A flexible ensemble learning framework integrating rare variants to enhance polygenic risk prediction

Chen, T.; Li, X.; Mazumder, R.; Zhang, H.; Lin, X.

2026-06-09 genetic and genomic medicine 10.64898/2026.06.07.26355109 medRxiv
Top 0.3%
14.0%

Whole-exome and whole-genome sequencing technology has enabled the discovery of rare genetic variants associated with human health and diseases. However, existing statistical methods used for rare variant association testing are not well-suited for building genetic risk prediction models that jointly incorporate rare and common variants. We propose STELLAR, a flexible ensemble learning-based approach to compute rare variant polygenic risk scores (PRS) using association summary statistics to enhance conventional common variant PRS. Our method combines burden-based and penalty-based rare variant analysis and leverages functional annotation information to prioritize potentially causal variants within the prediction models. In simulation studies, PRS using STELLAR consistently showed the highest prediction accuracy compared to models using common variants alone or rare variant burdens. Applied to UK Biobank whole-exome sequencing data (n=310,831) across eight continuous and five binary traits, STELLAR significantly improved prediction accuracy, refined stratification of individuals at the highest genetic risk beyond common variants, and prioritized biologically relevant genes. STELLAR provides a scalable strategy to incorporate rare variants into PRS in addition to common variants, advancing precision risk prediction and enabling more comprehensive assessment of genetic contributions to complex diseases.

6
Contextualizing the Utility of Polygenic Risk Scores using Absolute Risk Models in Diverse Ancestry Populations

Chatterjee, N.; Martina, F.; Kachuri, L.; Natarajan, P.; Witte, J.; Huo, D.

2026-06-04 genetic and genomic medicine 10.64898/2026.06.03.26354842 medRxiv
Top 0.4%
12.3%

Polygenic risk scores (PRSs) are emerging as powerful tools for quantifying inherited risk for common diseases and, in some cases, are approaching clinical implementation. A major concern for PRS implementation is their limited accuracy in non-European populations, particularly in those of African ancestry. However, past evaluations have focused on metrics such as relative risk or AUC, which do not capture background risk arising from contextual factors. We introduce a novel measure of variable importance, the conditional average derivative estimator (CADE), to evaluate PRS utility across diverse contexts and populations within absolute risk models that integrate PRSs with other relevant risk factors. We illustrate this framework by integrating PRSs for breast and prostate cancer within age-specific absolute risk models for incidence and mortality fit using individual-level data from the All of Us Research Program with inputs from the National Cancer Institute SEER cancer registry. Our projections show that although the PRSs are known to have the lowest discriminatory accuracy in African Americans (AA), there are contexts in which they provide greater utility, such as for the stratification of prostate cancer risk and mortality, where the CADE values for AA were 2- and 7-fold higher than for European Americans. These findings suggest that conclusions about the limited clinical utility of PRS in non-European populations may be premature and underscore the need to quantify PRS risk-stratification utility at the absolute-risk level, while accounting for disease onset, survival, and broader health and economic factors.

7
Human genetic evidence links serine biosynthesis to diabetic peripheral neuropathy

Fridman, V.; Kakar, A.; Jensen, A.; Van de Vondel, L.; Wheeler, A.; Phillips, L. S.; Zhou, J.; Zuchner, S.; Reusch, J.; Raghavan, S.

2026-06-10 genetic and genomic medicine 10.64898/2026.06.09.26355286 medRxiv
Top 0.6%
8.2%

Diabetic peripheral neuropathy (DPN) is a common and disabling condition for which no disease-modifying therapies are available. Glycemic and metabolic drivers do not fully explain why only a subset of individuals with diabetes develop DPN, and genetic contributors remain poorly defined. We aimed to perform a multi-population genome-wide association study (GWAS) of DPN to highlight potential new etiological pathways and therapeutic targets. Methods: We performed a multi-population GWAS of neuropathy in people with and without diabetes using the VA Million Veteran Program and UK Biobank, followed by replication in the All of Us Research Program (AoU), and gene-based and gene-set analyses to identify implicated pathways. Causal relationships between circulating serine levels and DPN were further tested using two-sample Mendelian randomization. To further evaluate pathogenic potential, we analyzed rare, high-impact variants in GWAS-implicated genes among individuals with unresolved inherited neuropathies using the GENESIS platform. Findings: Among individuals with type 2 diabetes, we identified seven genome-wide significant loci (p<5x10-): PHGDH and PSPH (key serine synthesis genes), TEAD1, CYP4F11, LARGE1, FTO, and COBLL1. No loci were significant in individuals without diabetes or with type 1 diabetes. Four loci (PHGDH, TEAD1, FTO, and CYP4F11) replicated in AoU (p<0.05). Mendelian randomization demonstrated that higher genetically predicted serine levels were associated with lower DPN risk, consistent with a causal role of serine metabolism in disease pathogenesis. Rare-variant burden analyses revealed associations of predicted deleterious variants with inherited neuropathy case status in PHGDH (odds ratio [OR] 12.7 [95% CI 7.9, 20.4]), PSPH (OR 8.5 [7.2, 10.2]), PHKG1 (OR 4.8 [3.7, 6.3]), and LARGE1 (OR 0.007 [0.0004, 0.1]). Interpretation: Convergent genetic evidence across common and rare variation implicates serine synthesis as a key pathway in DPN. These findings link diabetic and inherited neuropathies through a shared metabolic mechanism, identifying serine metabolism as a potential therapeutic target.
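For readers unfamiliar with two-sample Mendelian randomization, the sketch below shows one common estimator, the fixed-effect inverse-variance weighted (IVW) estimate. The abstract does not say which estimator the authors used, so IVW is an assumption here, and the per-variant summary statistics and the name `ivw_estimate` are hypothetical values chosen only to show the arithmetic.

```python
from math import exp, sqrt


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate of the exposure-to-outcome effect.

    Each variant contributes a Wald ratio (beta_outcome / beta_exposure),
    weighted by beta_exposure**2 / se_outcome**2.
    """
    numerator = 0.0
    denominator = 0.0
    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome):
        weight = 1.0 / se**2
        numerator += weight * bx * by
        denominator += weight * bx * bx
    estimate = numerator / denominator
    standard_error = sqrt(1.0 / denominator)
    return estimate, standard_error


# Hypothetical summary statistics for three instruments:
# variant effects on the exposure (e.g. serine level, per SD) and
# on the outcome (log-odds), with outcome standard errors.
beta_x = [0.10, 0.08, 0.12]
beta_y = [-0.020, -0.018, -0.021]
se_y = [0.005, 0.006, 0.004]

estimate, standard_error = ivw_estimate(beta_x, beta_y, se_y)
odds_ratio = exp(estimate)

print(f"IVW log-odds per unit exposure: {estimate:.4f} (SE {standard_error:.4f})")
print(f"Implied odds ratio: {odds_ratio:.3f}")
```

A negative estimate, and so an odds ratio below 1, corresponds to the direction reported in the abstract, where higher genetically predicted exposure goes with lower risk.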

8
Whole-exome-based preconception carrier screening in Uzbekistan with targeted SMA, FMR1, and DMD assays: the first reported clinical program

Kullyev, A.; Avdeichik, S.; Akimenkova, A.; Kartuesov, A.; Kardymon, O.; Goikhman, Y.

2026-06-04 genetic and genomic medicine 10.64898/2026.06.02.26354713 medRxiv
Top 0.7%
6.4%

Purpose: Published clinical outcome data on preconception carrier screening (PCS) in Central Asia are limited. We report the first clinical implementation study from Uzbekistan of a whole-exome sequencing (WES)-based multi-platform PCS program combining exome sequencing with targeted SMA, FMR1, and DMD assays. Methods: We retrospectively analyzed anonymized data from 65 individuals (19 couples, 27 singletons) screened at IMC Genomics, Tashkent, between January 2024 and May 2026. WES covering the protein-coding regions of approximately 20,000 genes was followed by exome-wide bioinformatics filtering and clinical geneticist interpretation. Partly overlapping cohorts underwent SMA carrier screening (n=179), FMR1 CGG-repeat analysis in females (n=155), and DMD deletion/duplication testing in preconception females (n=29). Variants were classified by ACMG/AMP criteria against gnomAD v4.1. Results: Sixty-one of 65 WES-screened individuals (93.8%; 95% CI 85.2-97.6%) carried at least one reportable variant (152 instances across 126 genes). Four of 19 couples (21.1%; 95% CI 8.5-43.3%) were concordant for pathogenic or likely pathogenic variants in the same autosomal recessive gene; two were referred for preimplantation genetic testing for monogenic disease. SMA screening identified four carriers, including two 2+0 silent carriers; FMR1 analysis identified one intermediate allele; DMD MLPA identified no exonic rearrangements. Conclusion: This first reported WES-based multi-platform PCS program in Uzbekistan was feasible and clinically informative, identifying actionable couple-level reproductive risks and supporting structured implementation of reproductive genetic screening in Central Asia.
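The abstract above does not say how its 95% confidence intervals were computed. As a minimal sketch, assuming a Wilson score interval with z = 1.96, the function below gives values that round to the two reported intervals (61 of 65 and 4 of 19). The function name `wilson_interval` is illustrative.

```python
from math import sqrt


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Counts taken from the abstract.
wes_low, wes_high = wilson_interval(61, 65)        # reportable-variant carriers
couple_low, couple_high = wilson_interval(4, 19)   # concordant couples

print(f"61/65: {100 * wes_low:.1f}% to {100 * wes_high:.1f}%")
print(f"4/19:  {100 * couple_low:.1f}% to {100 * couple_high:.1f}%")
```

Unlike the simple normal-approximation interval, the Wilson interval stays inside 0-100% and is not centered on the observed proportion, which matters for small samples such as the 19 couples here.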

9
Phenome-wide association of multiallelic copy number variation in 422,170 UK Biobank individuals reveals novel genetic loci associated with disease

Eisenberg, M.; Packer, R.; Shrine, N.; Demidov, G.; Pack, H.; Hollox, E. J.; Fawcett, K.

2026-06-04 genetic and genomic medicine 10.64898/2026.06.03.26354825 medRxiv
Top 0.8%
6.4%

The contribution of multi-allelic CNVs (mCNVs) to disease risk has not been widely studied. This is largely because they have been difficult to characterise genome-wide at large scale and are often not strongly associated with flanking SNVs, limiting imputation. Improved understanding of the role of mCNVs in disease risk could lead to novel insights into the pathobiology of disease. We robustly typed 69 mCNVs from UK Biobank whole exome sequences in discovery (n=150,682) and replication sets (n=269,317). Discovery and replication PheWAS used clinically-curated composite phenotypes by integrating self-report, primary and secondary health care data to interrogate these variants, for unrelated British individuals of African, European and Central/South Asian ancestries. 173 mCNV-phenotype associations were detected from 26 mCNVs, of which 114 associations replicated. One of eight potentially novel mCNV-phenotype signals was independent of neighbouring associated SNVs: the association of Sulfotransferase 1A1 and 1A2 genes (SULT1A1/SULT1A2) with estimated glomerular filtration rate (eGFR) in individuals of European ancestry (meta-analysed p=1.05×10⁻⁹, beta=0.016 [0.011; 0.021]). Other potentially novel associations include Golgi phosphoprotein 3 (GOLPH3) with the cardiovascular phenotype bundle branch block in individuals of South Asian ancestry (meta-analysed p=3.35×10⁻⁶, OR=2.13 [1.53, 2.96]) and alpha amylase 2B (AMY2B) with ventricular fibrillation and flutter in individuals of European ancestry (meta-analysed p=2.48×10⁻⁶, OR=1.50 [1.26; 1.78]). In summary, we show that accurate typing of biobank-scale sample sizes can identify associations between traits and mCNVs, acting through a gene dosage relationship. Our work provides several novel likely causative variants contributing to particular traits of clinical importance and immediately suggests a putative functional mechanism for the observed associations.

10
Parental educational attainment polygenic scores contribute to phenotypic heterogeneity in offspring with autism

Gao, S.; Sui, Y.; Tian, P.; Rao, X.; Yan, C.; Xu, Y.; Wang, T.

2026-06-08 genetic and genomic medicine 10.64898/2026.06.03.26354779 medRxiv
Top 1%
3.6%

Educational attainment-related polygenic scores have been implicated in autism spectrum disorder (ASD), but how parental polygenic scores shape offspring phenotypes remains unclear. Using genotyping and exome-sequencing data from 142,357 individuals (55,252 ASD cases) in a large ASD cohort, we dissected the direct and indirect genetic effects of educational attainment-related polygenic scores on ASD phenotypes. Trio-model analyses showed that parental polygenic scores for educational attainment (PGSEA) were associated with milder core ASD symptoms, including social deficits and repetitive behaviors, predominantly through indirect genetic effects, whereas their associations with comorbidities were driven predominantly by direct genetic effects. PGSEA was also significantly negatively associated with rare variant burden and prenatal factors, although these factors contributed largely independently to most phenotypes. Adjustment for full-scale intelligence quotient (FSIQ) and socioeconomic status (SES) partially attenuated the indirect effects of PGSEA on offspring phenotypes. Finally, higher parental PGSEA was associated with later age at diagnosis in offspring, partly through its protective effects on ASD phenotypes. These findings indicate that indirect genetic effects of parental PGSEA contribute substantially to phenotypic variation in ASD and highlight family-mediated pathways as an important component of ASD heterogeneity.

11
Incremental Clinical Value of Single-Molecule Nanopore Sequencing in Thalassemia Testing: A Prospective Double-blind, Multicenter Study

Xiang, J.; Zhu, B.; Xu, H.; Chen, Y.; Sun, X.; xiang, r.; Zhao, Y.; Liu, W.; Zhang, L.; He, J.; liu, j.; Chen, Y.; Fan, Z.; Zhang, H.; Tan, J.; Pang, L.; Shi, L.; Kong, Y.; Cai, A.

2026-06-09 hematology 10.64898/2026.06.09.26354559 medRxiv
Top 1%
3.2%

Background: Thalassemia is one of the most common monogenic disorders worldwide, but current screening strategies combining hematological testing with molecular assays still carry a risk of missed diagnoses and suboptimal efficiency, particularly for complex structural variants and rare mutations. Methods: In this prospective double-blind, multicenter cohort study of 3,842 participants (3,362 pregnant women and 480 male partners), we conducted a head-to-head comparison to systematically evaluate the incremental clinical value and detection performance of single-molecule nanopore sequencing in thalassemia (SMITH) against conventional hematological testing and next-generation sequencing (NGS). Findings: The overall concordance rate between NGS and SMITH was 98.6% (3789/3842). The discrepant cases (n=53) were directly attributed to the superior detection capabilities of SMITH, which successfully identified complex structural rearrangements, including 45 -globin gene triplications and four HK alleles, that were missed by NGS. Furthermore, SMITH accurately detected four rare variants (c.134_135insT/, c.-22(C>T)/, βN/βc.316-290delinsAGGGCAATAATTT and β3.5 kb deletion/βN) and resolved ten trans and three cis configurations within the globin gene allele. Clinically, these technical advantages translated to a 9.3% (5/54) increase in the detection rate of high-risk prenatal couples, effectively preventing one birth affected by moderate-to-severe thalassemia. Additionally, SMITH corrected a diagnostic discrepancy in one case (HK vs. -3.7), sparing the couple from an unnecessary invasive procedure. Interpretation: Our findings demonstrate that SMITH provides a powerful platform for resolving globin gene rearrangements, detecting rare variants, and enabling direct haplotype phasing. By effectively eliminating diagnostic blind spots, SMITH is expected to become an optimal method for thalassemia prevention programs. 
Funding: This study was supported by Chinese National Natural Science Foundation Projects 81760037 and 82271894.

12
Breast cancer polygenic risk score performance varies by socioeconomic status

Domian, H. I.; Tian, X.; Ong, D.; Hamilton, L.; Shieh, Y.; Musharoff, S. A.

2026-06-04 genetic and genomic medicine 10.64898/2026.06.03.26354819 medRxiv
Top 2%
2.5%

Background: Polygenic risk scores (PRS) for breast cancer are increasingly used for risk stratification to inform screening and prevention. However, for PRSs to be equitable and clinically useful, they need to perform well across diverse populations. While PRS performance is known to be ancestry-dependent, it is not well understood how environmental context, such as that of socioeconomic status (SES), affects PRS transferability. Here, we assess whether SES, measured via self-reported household income, modifies breast cancer PRS performance and, if so, whether socioeconomic context contributes predictive information beyond genetic risk alone. Methods: We used the US-based All of Us biobank to evaluate how SES impacts breast cancer PRS performance. First, we quantified changes in breast cancer PRS performance by modeling a commonly cited polygenic score for breast cancer previously described by Mavaddat et al. with SES. We then reestimated the genetic effect sizes of the 3,820 variants from Mavaddat et al. in All of Us with and without income as a covariate. Because social determinants of health affect breast cancer detection and outcomes, we stratified analyses by socially defined populations on the basis of self-identified race and ethnicity. We further stratified individuals whose self-identified race is White ("White") into three SES groups (high, middle, low) based on self-reported income and re-estimated genetic effect sizes to create SES-specific PRSs. We then applied these PRSs to White participants, the largest group in the study, and to Black or African American ("Black") and Hispanic or Latino ("Hispanic") participants, groups underrepresented in breast cancer research. Model discrimination between cases and controls was measured by area under the curve (AUC). 
Results: We analyzed 163,715 women from the All of Us biobank, which included 8,833 breast cancer cases (6,619 White, 1,178 Black, and 1,036 Hispanic), with relative income available for a subset of these cases (5,525 White, 848 Black, and 566 Hispanic). The ancestry-dependent performance of the breast cancer PRS described in Mavaddat et al. was replicated in All of Us. In Black individuals, this PRS (AUC and 95% CI: 0.576 [0.571, 0.582]) produced a similar increase in AUC as relative income (AUC: 0.573 [0.568, 0.577]) when added to an age-only model. Incorporating income with PRS, age, and genetic PCs 1-3 improved AUC by 0.007 in White Americans and 0.018 in Black Americans (both p < 10⁻¹¹), while attenuating the contribution of PRS in the full model. PRS performance also varied among SES categories. Notably, PRSs with variant effect sizes that were recalibrated in low-SES White participants performed best in low-SES White participants (AUC: 0.605 [0.583, 0.628]) and Black Americans (AUC: 0.588 [0.586, 0.591]), both better than performance in high-SES White Americans (AUC: 0.579 [0.577, 0.580]) and middle-SES White Americans (AUC: 0.578 [0.569, 0.586]). Conclusion: Socioeconomic context, measured by income, significantly impacts the transferability of a PRS for breast cancer within and among groups defined by self-identified race and ethnicity. Accounting for SES improves PRS performance, most notably in Black Americans and low-SES White individuals.
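The abstract above compares models by AUC. As a minimal sketch of what that metric measures, the function below computes AUC as the probability that a randomly chosen case has a higher score than a randomly chosen control, counting ties as one half. The scores and the name `auc` are hypothetical; the authors' models, covariates, and confidence-interval method are not reproduced here.

```python
def auc(case_scores, control_scores):
    """AUC as the fraction of case-control pairs in which the case
    scores higher than the control (ties count as half a win)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical risk scores (e.g. a standardized PRS) for illustration only.
cases = [0.9, 0.4, 0.7]
controls = [0.1, 0.5, 0.3, 0.7]

value = auc(cases, controls)
print(f"AUC = {value:.3f}")
```

An AUC of 0.5 means the score separates cases from controls no better than chance, which gives context for reported values in the 0.57 to 0.61 range and for gains of 0.007 to 0.018.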

13
A mechanistic model for genetic regulation of postmenopausal bone loss

Rattsev, I.; Mac Gabhann, F.; Hertz, D.; Taylor, C. O.

2026-06-08 endocrinology 10.64898/2026.06.04.26354968 medRxiv
Top 2%
1.7%

Bone remodeling is a tightly regulated physiological process that maintains bone health through coordinated action of bone-resorbing osteoclasts and bone-forming osteoblasts. Disruption of this balance, such as that induced by estrogen decline after menopause, results in bone loss and osteoporosis. Genetic factors play an important role in determining bone mineral density (BMD) loss over time. However, translating genetic associations into individualized risk prediction remains challenging due to the small effect sizes of individual variants and non-linear interactions within the bone remodeling unit. Here, we present a bone cell population dynamics model that includes major regulatory pathways, such as the RANK/RANKL/OPG axis, Wnt signaling, and hormonal regulation by estrogen, parathyroid hormone, and TGF-β. We calibrate the model on clinical data from healthy postmenopausal women and women with reduced BMD undergoing anti-osteoporotic therapy. The calibrated model captures healthy BMD decline in postmenopausal women and therapeutic response to anti-osteoporotic medications. We mechanistically incorporate the effect of 22 variants across 8 genes involved in bone remodeling and simulate BMD trajectories in 1,000 virtual subjects differing by ancestry and genetic makeup. The median predicted 5-year BMD loss was 3.57% (95% prediction interval: 1.31-5.24), consistent with the values reported in the literature. The virtual individuals with African ancestry were predicted to experience the highest average 5-year BMD loss. The strongest genetic risk factors for bone loss were predicted to be CYP19A1 rs727479 and OPG rs3102735, while LRP5 rs11228240 emerged as a protective factor that could partially counteract the detrimental effects of other variants. Several epistatic effects were observed in the genetic interaction analysis. Mechanistically, our model suggested that estrogen exerts its effect on bone remodeling primarily by modulating osteoclast apoptosis. 
Overall, this framework provides a proof of concept for integrating genetic risk factors into mechanistic models of disease and can be extended to other conditions with polygenic inheritance.

14
Multi-ancestry genome-wide association study and meta-analysis of stimulant use disorder reveals biology and relationships to other psychiatric disorders

Beck, S. E.; Deak, J. D.; Levey, D. F.; Ge, T.; Jeffries, P. W.; Lai, D.; Mallard, T. T.; Degenhardt, L.; Lind, P. A.; Tollerup Nielsen, T.; Tubbs, J. D.; Wetherill, L.; Johnson, E. C.; Hatoum, A. S.; The SUD Working Group of the Psychiatric Genomics Consortium, ; COGA Collaborators, ; Yale-Penn Collaboration, ; The VA Million Veteran Program, ; Borglum, A.; Demontis, D.; Medland, S. E.; Martin, N. G.; Nelson, E. C.; Smoller, J. W.; Kranzler, H. R.; Gaziano, J. M.; Stein, M. B.; Agrawal, A.; Edenberg, H. J.; Gelernter, J.

2026-06-10 genetic and genomic medicine 10.64898/2026.06.05.26354997 medRxiv
Top 2%
1.6%

Stimulant use disorder (StimUD) is a significant public health problem, but genetic studies have been limited by small sample sizes. We conducted genome-wide association studies (GWAS) of StimUD in the Million Veteran Program (MVP) and All of Us (AOU), followed by meta-analysis with FinnGen and 10 additional datasets, for a total of 709,369 individuals (Ncases=33,977, Ncontrols=675,392) in four broad ancestry groups: European (EUR) (Ncases=22,564, Ncontrols=624,672), African (AFR) (Ncases=7,574, Ncontrols=34,189), Admixed American (AMR) (Ncases=3,657, Ncontrols=15,698), and East Asian (EAS) (Ncases=182, Ncontrols=833). Population-specific SNP heritability was 6.1% in EUR and 2.4% in AFR. We discovered a total of 19 genome-wide-significant loci, six in EUR, including DRD2*rs5794864, P=7.32E-10, one in AFR, five in a multi-ancestry meta-analysis, including CHRNA5*rs55781567, P=3.27E-9, two in a male-only meta-analysis, including FTO*rs8057044, P=9.50E-9, and five in a meta-analysis of sex-stratified results. In a hold-out AOU subsample (NEUR=18,841, NAFR=12,263, NAMR=9,739), ancestry-specific polygenic risk scores were significantly associated with StimUD in EUR (OR=3.28, 95% confidence interval (CI)=2.89-3.71) and AMR (OR=2.01, 95% CI=1.71-2.37). Transcriptome-wide association studies, fine-mapping, and colocalization analyses prioritized additional genes (e.g., GPX1, BSN). Genetic correlation, Mendelian randomization, and causal mixture analyses revealed relationships with other substance use and use disorder phenotypes, including cannabis use disorder (rg=0.94, P=5.43E-237) and opioid use disorder (rg=1.01, P=4.40E-107), and other psychiatric traits, including anxiety, depression, neuroticism, and attention-deficit/hyperactivity disorder. This is the first well-powered GWAS of StimUD, and it offers significant insights into disease biology.

15
HbF/F-cell and the Phenotype of Sickle Cell Disease

Wilks, A.; Lofters, J.; Lee, J.; Milton-Hicks, J.; Klings, E.; Steinberg, M.

2026-06-04 hematology 10.64898/2026.06.02.26354737 medRxiv
Top 2%
1.5%

Fetal hemoglobin (HbF) prevents the polymerization of sickle hemoglobin (HbS). HbF, measured usually as a percent of total hemoglobin (%HbF), is inversely associated with the severity of sickle cell disease (SCD) but fails to capture the distribution of HbF concentrations within red blood cells (RBCs). The relative proportion of HbF and HbS within an RBC is reflected by the HbF:HbS ratio, whereas HbF/F-cell quantifies the absolute amount of HbF/RBC. While correlated, HbF:HbS ratio and HbF/F-cell are not interchangeable. In the context of mean corpuscular hemoglobin (MCH), HbF/F-cell approximates whether sufficient HbF is present to inhibit HbS polymerization. We examined the association of mean HbF/F-cell with sub-phenotypes of sickle cell disease in three independent cohorts. Both %HbF and HbF/F-cell were significantly associated with multiple clinical and laboratory features of SCD; however, HbF/F-cell demonstrated stronger associations with clinical severity measures across cohorts. Higher HbF/F-cell was associated with fewer clinical events, reduced hemolysis, and lower mortality. Changes in HbF/F-cell after hydroxyurea treatment were associated with ~11-13% reduction in acute events in patients with <1 pg increase and >60% reduction with a >5 pg increase in HbF/F-cell. For each pg increase in HbF/F-cell there was ~6% reduction in the rate of acute events. As a surrogate for the distribution of HbF concentrations among F-cells, HbF/F-cell adds physiologically relevant insights that could guide prognosis and treatment.

16
A liquid biopsy-centered, pan-cancer, open next generation sequencing panel to support clinical decision-making (LION panel)

Feierabend, S.; Künstner, A.; Forster, M.; Helbing, T.; Gebauer, N.; Gemoll, T.; Axt, F.; Nimmagadda, S. C.; Ranganathan, L.; Schwandt, J.; Heber, M.; Szymczak, S.; Hohensee, I.; Fliedner, S. M. J.; Scherer, F.; Oberländer, M.; Derer-Petersen, S.; Busch, H.; von Bubnoff, N.; Dazert, E.

2026-06-08 oncology 10.64898/2026.06.05.26354976 medRxiv
Top 3%
1.3%

Cancer treatment has shifted toward personalized therapy based on molecular profiling, particularly in advanced disease. Existing circulating tumor DNA panels are often broad, generating many non-actionable variants and incurring costs that limit routine use in molecular tumor boards. We developed and validated a manufacturer-independent, 109-gene liquid biopsy-centered pan-cancer open next generation sequencing panel (LION panel), combined with an in-house bioinformatic pipeline to support clinical decision-making. A total of 87 samples were analyzed, including 17 reference samples, 21 healthy blood donor controls, and 49 patient samples including nine tumor entities. The LION panel achieved 92% sensitivity and 99% specificity in reference samples, with high concordance to digital droplet PCR (r = 0.99). It detected variant allele frequencies as low as 0.05% (tumor-informed) and 0.5% (tumor-uninformed). Clinical concordance reached 82% with blood-based digital droplet PCR and 75% with whole exome tissue sequencing. In representative cases, variant dynamics correlated with disease progression and revealed additional targetable variants. Overall, the LION panel supports clinical decision-making by enabling identification of targetable variants, disease monitoring, and detection of treatment resistance, particularly when tumor tissue is unavailable.

17
More Than Results: A Qualitative Study on the Role of Person-Centered Genetic Counseling in Parkinson Disease Research

Verbrugge, J.; Fiallos, K.; Cook, L.; Miller, M.; Head, K. J.

2026-06-09 genetic and genomic medicine 10.64898/2026.06.03.26354465 medRxiv
Top 3%
1.0%

As genetic testing becomes increasingly integrated into Parkinson disease (PD) research, including targeted testing for variants in LRRK2 and GBA1, the return of individual research results is becoming more common. However, limited qualitative data exist regarding how research participants experience genetic results disclosure and post-test genetic counseling in PD research settings. We conducted semi-structured qualitative interviews with participants (n=13) enrolled in the Parkinson Precision Medicine Initiative (formerly Parkinson Progression Markers Initiative; PPMI) who had received PD-related genetic test results and post-test genetic counseling. Interviews were conducted 1 to 3 weeks following result disclosure and analyzed using thematic analysis with a primarily deductive coding approach informed by study aims and inductive identification of emergent themes. Four primary themes were identified: (1) personal connection and motivations for participation, (2) centrality of result disclosure and information preferences, (3) emotional experiences and support needs, and (4) communication quality and alignment with participant needs. Overall, our findings underscore the importance of person-centered genetic counseling within PD research. As return of genetic and biomarker results in research and clinical trial contexts expands, thoughtful integration of relational, informational, and communication-focused practices will be essential to support participant engagement and trust.

18
Distinct and shared genetics of kidney filtration function versus albuminuria revealed by multi-trait GWAS

de Hesselle, H. C.; Garben, B.-F.; Stark, K. J.; Warth, R.; Teumer, A.; Pattaro, C.; Heid, I. M.; Winkler, T. W.

2026-06-09 genetic and genomic medicine 10.64898/2026.06.08.26355141 medRxiv
Top 3%
1.0%

Chronic kidney disease is characterized by decreased glomerular filtration rate (eGFR, estimated from serum creatinine or cystatin C) or increased urinary albumin-to-creatinine-ratio (UACR). Genome-wide association studies provided the genetic make-up of these traits, but their overlap remained largely unknown. Our multi-trait GWAS (N=1M) identified 812 signals and multi-trait fine-mapping sharpened the identification of likely causal variants. Of 333 signals classified for filtration function or albuminuria, only 11 overlapped. Their effects on eGFR and UACR were directionally concordant, dominated by eGFR and independent of HbA1c or mean arterial pressure. Mapped genes pinpointed mechanisms related to glomerular filtration area (SHROOM3, EPB41L5) and sodium-mediated intraglomerular pressure (NRBP1, DPEP1/CHMP1A). Genetics of fluid intake resulted in shadow effects on UACR without albumin leakage into urine. Our multi-trait approach sharpened the identification of likely causal genes for kidney traits, demonstrated largely distinct genetics for filtration function versus albuminuria, and provided new biological insights into the overlap.

19
Clonal Hematopoiesis of Indeterminate Potential Refines Cardiovascular Risk Stratification in Cardiovascular-Kidney-Metabolic Syndrome Stages 0-3

Lu, J.; Sun, S.; Deng, Z.; Wang, S.; Wei, C.; Jiang, S.; Li, W.

2026-06-08 epidemiology 10.64898/2026.06.04.26354963 medRxiv
Top 3%
0.9%

Background: Chronic low-grade inflammation drives cardiovascular-kidney-metabolic (CKM) syndrome. Clonal hematopoiesis of indeterminate potential (CHIP), an age-related driver of systemic inflammation, is linked to several cardiometabolic disorders. However, whether CHIP modifies CKM progression and contributes to heterogeneity in cardiovascular disease (CVD) risk within the CKM framework remains uninvestigated. Methods: This cohort study included 307,025 UK Biobank participants at CKM stages 0-3 free of baseline CVD. CHIP status was identified via whole-exome sequencing (WES). The association between CHIP and baseline CKM severity was examined, along with the independent and joint effects of CHIP and CKM stages on incident CVD risk. The joint effects of CHIP and polygenic risk scores (PRS) were further assessed, and the incremental predictive value of incorporating CHIP into the AHA PREVENT equations was evaluated. Results: CHIP carriers were more likely to present with advanced CKM stages [OR 1.14 (1.09-1.20), P < 0.001] and exhibited higher incident CVD risk during follow-up [HR 1.13 (1.08-1.18), P < 0.001]. Significant joint effects between CHIP and CKM stages were observed, with the highest risk among CHIP carriers at CKM stage 3 [HR 1.63 (1.50-1.78), P < 0.001]. Large or multiple CHIP mutations conferred greater hazards, with distinct gene-specific effects observed. Moreover, CHIP and high genetic risk also jointly amplified CVD susceptibility. Most importantly, incorporating CHIP into AHA PREVENT significantly improved risk discrimination. Conclusions: CHIP is a significant risk factor associated with more advanced CKM stages and amplifies incident CVD risk. Integrating CHIP into existing prevention strategies may refine CVD risk stratification.

20
A single-nucleus transcriptomic atlas of human basal ganglia during development forwarding diagnosis and therapy of pediatric movement disorders

Lange, B. K. A.; Graceffo, E.; Stenzel, W.; Biebermann, H.; Schuelke, M.; Wilpert, N.-M.

2026-06-04 nephrology 10.64898/2026.06.04.26354648 medRxiv
Top 3%
0.9%

Gene therapy is rapidly emerging as a transformative treatment for monogenic neurological disorders, including pediatric movement disorders such as aromatic L-amino acid decarboxylase (AADC) deficiency. However, its success critically depends on defining target cells and windows for therapeutic intervention. Here, we present an open-access single-nucleus transcriptomic atlas of the human basal ganglia spanning a therapy-relevant window from second/third trimester to the perinatal period and adulthood. Across 35,755 nuclei, we identify major (non-)neuronal cell types, retrace developmental trajectories, and characterize gene-regulatory networks. We identify previously unrecognized human-specific expression of key neuronal signaling genes, including GNAO1 and ADCY5, and discuss the implications for targeted gene replacement therapies. Unexpectedly, we found that the Huntingtin gene (HTT) is already expressed during prenatal stages of human brain development, supporting a previously proposed neurodevelopmental component of Huntington's disease, which should be considered in diagnostic and therapeutic strategies. Moreover, FOXG1 expression and regulon activity are predominantly located in a prenatal time window, suggesting constraints on the effectiveness of postnatal interventions. Our findings highlight the importance of datasets capturing human brain development in real time and provide a publicly available resource to guide precision gene therapy strategies in the future.